Mon. Not. R. Astron. Soc. 000, 000-000 (0000) 



Printed 29 March 2011 






The Local Bias Model in the Large Scale Halo Distribution 



M. Manera^1,2 & E. Gaztañaga^3

^1 Institute of Cosmology and Gravitation, University of Portsmouth, Dennis Sciama Building, Burnaby Road, Portsmouth PO1 3FX, UK
^2 Center for Cosmology and Particle Physics, New York University, 4 Washington Place, NY 10003, New York, USA
^3 Institut de Ciències de l'Espai, CSIC/IEEC, Campus UAB, F. de Ciències, Torre C5 par-2, Barcelona 08193, Spain






ABSTRACT 

We explore the biasing in the clustering statistics of halos as compared to dark matter
(DM) in simulations. We look at the second and third order statistics at large scales
of the (intermediate) MICEL1536 simulation and also measure directly the local bias
relation $h = f(\delta)$ between DM fluctuations, $\delta$, smoothed over a top-hat radius $R_s$ at
a point in the simulation and its corresponding tracer $h$ (i.e. halos) at the same point.
This local relation can be Taylor expanded to define linear ($b_1$) and non-linear ($b_2$)
bias parameters. The values of $b_1$ and $b_2$ in the simulation vary with $R_s$, approaching
a constant value around $R_s > 30-60$ Mpc/h. We use the local relation to predict the
clustering of the tracer in terms of that of the DM. This prediction works very well
(at about the percent level) for the halo 2-point correlation $\xi(r_{12})$ for $r_{12} > 15$ Mpc/h, but
only when we use the biasing values that we found at very large smoothing radius,
$R_s > 30-60$ Mpc/h. We find no effect from stochastic or next-to-leading order terms
in the $f(\delta)$ expansion. But we do find some discrepancies in the 3-point function that
need further understanding. We also look at the clustering of the smoothed moments,
the variance and skewness, which are volume-averaged correlations and therefore include
clustering from smaller scales. In this case, we find that both next-to-leading order
and discreteness corrections (to the local model) are needed at the 10-20% level.
Shot-noise can be corrected with a term $\sigma_\epsilon^2/\bar{n}$ where $\sigma_\epsilon^2 < 1$, i.e., always smaller than
the Poisson correction. We also compare these results with the peak-background split
predictions from the measured halo mass function. We find 5-10% systematic (and
similar statistical) errors in the mass estimation when we use the halo model biasing
predictions to calibrate the mass.



1 INTRODUCTION 

To do precision cosmology it is important to understand
galaxy bias accurately, i.e., how the spatial distribution of
galaxies is related to the underlying dark matter distribution.
Because galaxies are known to form in dark matter halos,
their biasing can be approached in two natural steps. The
first step is the bias between halos and dark matter. The
second step is the bias between galaxies and halos, which is
commonly approached by means of models of galaxy occupation
in halos (see for instance Zheng et al. 2005 & 2009,
Brown 2008, Tinker 2006 & 2010).

Biasing requires complex modeling, and in this paper
we will focus on the first step only. This means that our
findings might not be directly applicable to galaxy surveys.
In the limit in which halo biasing resembles galaxy biasing, or
in the limit where observations are good tracers of the halo
distribution (i.e., for galaxy groups or clusters), our results
will be of direct relevance to the interpretation of clustering
statistics in galaxy and cluster surveys.



We will study the halo bias in a large cosmological dark
matter simulation from the MICE collaboration (Fosalba
et al. 2008, Crocce et al. 2009)^1. We will address two
main questions: a) how accurately the so-called local bias
model predicts clustering statistics, and b) how bias predictions
from the mass function compare with those of the
local model. In the process of answering these questions we
will also learn about nonlinear and stochasticity contributions
to the halo variance. The local bias model, introduced
by Fry and Gaztañaga (1993), assumes a general non-linear
(but local and deterministic) relation between the smoothed
density contrast in the distribution of halos (or galaxies)
and the smoothed density contrast of the dark matter, i.e.,
$\delta_h = F[\delta_m]$. In reality, bias is stochastic and not quite
deterministic (e.g., see Somerville et al. 1999, Tegmark & Bromley
1999, Dekel & Lahav 1999) and at some level, due to tidal
forces and evolution, it will have non-local and anisotropic
contributions. It is also not clear to what extent the halo or
galaxy density should depend only on the underlying matter
density, without including other direct dependencies (like
the gravitational potential or velocity fields, for instance).
Bias could also relate to mass at some initial condition or in
Lagrangian space (see Catelan, Matarrese & Porciani 1998,
Matsubara 2008 and references therein).

^1 For more information about the MICE collaboration team and
the simulations, see http://ice.cat.es/mice/

The bias parameters of the local model are the coefficients
of the Taylor expansion of $F[\delta_m]$, and depend on the
halo and dark matter smoothing scale. In this paper we show
that for halo samples with a minimum mass less than $10^{14}$
solar masses the bias parameters converge at a smoothing
scale of ~ 30-60 Mpc/h. We can then compare these local bias
parameters, obtained by directly fitting $F[\delta_m]$ in the simulations,
with the bias from clustering measurements like the
2 and 3-point correlation functions, the variance and the
skewness. We will show that the local bias model works well,
at least to within the few percent level. When this local model is
applied to interpret real galaxy surveys it can be used to recover
information about dark matter clustering and biasing
parameters.

One common way to predict the bias parameters is to
use the peak background split Ansatz. Bias parameters are
predicted from the mass function using a few assumptions:
locality, and the assumption that the conditional mass
function of an overdense (underdense) patch of the universe
can be treated as if it were equal to the average mass function
of the universe at a different time, or mean density. Peak
background split predictions for the bias, especially from the
Sheth-Tormen mass function (Sheth and Tormen 1999) and
the Press-Schechter mass function (Press & Schechter 1974,
PS from now on), have been widely used in the literature. In
the second part of this paper we will compare these bias
predictions with the bias from clustering, and with the local
bias, and study their dependence on the halo mass threshold
used to fit the mass function. The inaccuracy of the peak
background split has also been studied in Manera, Sheth and
Scoccimarro (2010), with results complementary to those of
this work.

The peak background split bias parameters predicted
from the Press-Schechter (PS) mass function, together with
the assumption of local bias, were tested in a precursor
paper by Mo, Jing and White (1997), where the local model
was used to compare the skewness and higher order moments
of halos in N-body simulations with predictions and observations,
leading to the conclusion that the galaxies from the
APM survey (as measured in Gaztañaga 1994) should not
be highly biased. Mo, Jing and White used a small simulation
of only 256 Mpc/h and $128^3$ particles, with plots that
show no errorbars. In some of their plots, especially when
halos are identified at the same time that moments are calculated,
differences between theory and simulations could
be interpreted as being significant for our current precision
requirements. Unfortunately, since they tested the PS bias
parameters and the local bias model together, it is unclear
how each assumption contributes to the mismatch.

In a follow-up paper, Casas-Miranda, Mo and Boerner
(2003) redid the previous analysis, this time with the Sheth
and Tormen mass function, and applied the results to the
Lyman break galaxies at z = 3. Their plots of skewness
and higher order moments still show no error bars, and
differences between theory and simulations can amount to
more than 15% in some cases. Again, the question arises as to
whether the local bias is a good approximation or not, independent
of the bias prediction from the mass function, which
requires extra assumptions and varies depending on which mass
function one decides to use. In our paper we can separate
these effects by obtaining the local bias parameters directly
from a fit of the local bias relation F, thus testing the local
model separately from the bias predictions. A failure of
the local bias model could point towards what other contributions
should be included next (if any) when analysing
observational data to the precision needed for the current
generation of surveys.

Another difference in our analysis with respect to the
previous works above is that we study both moments (variance
and skewness) and 2 and 3-point correlation functions.
Moments are closer to the local relation in that they are both
smoothed (spherically averaged) quantities, so one would expect
better agreement for them. But they suffer from shot-noise
(or discreteness effects) and stronger non-linear effects
(as they include clustering on all scales smaller than the
smoothing radius). The 2 and 3-point functions do not suffer
from shot-noise and can better separate the effect of different
scales (because they are averaged over radial shells
rather than integrated over spheres). Moreover, the 3-point
function provides different information than the skewness.
Both are related third order statistics, but the 3-point function
also gives shape information (i.e., how elongated the
triangles are) which is missing in the skewness. We also study
the 2-point cross-correlation of mass and halos, which gives
an idea of how important the stochasticity is in the bias
relation.

The relation between the mass function and the bias can
be inverted. Consequently one may use the bias as a proxy
for the mass of the halo sample. This is of direct relevance to
the interpretation of observational data. Systematic errors in
estimating the mass from the bias would propagate to, and
broaden, the constraints on cosmological parameters (like
the dark energy equation of state parameter w) when fitted
to the estimated halo mass function. Notice that self-calibration
methods for the mass function, which are expected
to be used by DES-like surveys, assume that we know the
mass-bias relation (Lima & Hu 2005, 2007). In this work we
will assess how well the halo mass is recovered by using as
input the clustering bias parameters from the 2-point and from
the 3-point correlation functions.

Going one step further, to relate halos to galaxies, it
has become customary to use the Halo Occupation Distribution
(HOD) prescription, which consists of populating, either
theoretically or in the simulation, the dark matter halos
with galaxies, using some simple (author-variable) population
function that usually depends only on three or four
parameters. The HOD prescriptions are far from providing
a few percent precision in all measurements. For instance,
Scoccimarro et al. (2001) found that they were unable to
match both the variance and the skewness of APM galaxies.
Since a local model of biasing is assumed along with the
bias prediction from the mass function, it is unclear if this
disagreement is due to the HOD choice or to the failure of either
the local model or the bias predictions. It is therefore of
direct observational interest to assess each step separately,
which is what this paper starts doing.

Indications that more work is needed to construct reliable
galaxy mocks have been given by Guo and Jing (2009).
Using a semianalytical mock sample of galaxies constructed
from an N-body simulation, they compared the local bias
parameters from clustering with the local bias predicted using
the peak background split Ansatz plus an HOD, which
were found to be significantly different. Such a difference may
arise from the fact that the authors were using a published
prescription instead of fitting their own HOD function, but
part of the disagreement could come as well from the local
bias and the PBS Ansatz.

Finally, note that we only study clustering in configuration
space. Bias will most likely have different effects in
Fourier space, in particular regarding shot-noise effects and
stochasticity (see e.g. Tinker et al. 2010, Seljak, Hamaus &
Desjacques 2009, Cai, Bernstein & Sheth 2010, and references
therein).

The paper is organized as follows. Section 2 gives a
brief introduction to the simulations. In Section 3 we study
the performance of the local bias model in simulations and
present a study of shot-noise and next-to-leading order contributions.
In Section 4 we compare the clustering of halos
in the simulations with the peak background split predictions
and check how well we can recover the mass of halos
from the bias parameters. We present our conclusions in
Section 5.



2 NUMERICAL SIMULATIONS 

In this paper we work with the comoving data from the
MICE intermediate dark matter simulation, which has a volume
of $V = (1536\,\mathrm{Mpc}/h)^3$ and $N = 1024^3$ particles, and
consequently a mass resolution of $2.34 \times 10^{11}\,M_\odot/h$. This simulation
was run on MareNostrum at the Barcelona
Supercomputing Center using the L-GADGET code, periodic
boundary conditions, and 128 processors. In this paper
we use the z = 0.0 and the z = 0.5 comoving outputs.
The cosmological model parameters for the simulation are
$\Omega_m = 0.25$, $\Omega_\Lambda = 0.75$, $\Omega_b = 0.044$, $n_s = 0.95$, $h = 0.7$ and
$\sigma_8 = 0.8$. The softening length of the simulation is 50 kpc/h.
The initial conditions were set at z = 50 using the Zel'dovich
approximation. Halos have been found using a Friends of
Friends algorithm with a linking length of 0.168 times the mean
interparticle distance, which results in 2729833 halos of more
than 20 particles at z=0, and 2110669 halos at z=0.5. The
effect of changing the linking length has been studied in Manera,
Sheth and Scoccimarro (2010). By working with comoving
data we concentrate on the gravitational evolution and
structure formation and get rid of redshift distortions and
other lightcone effects, which might not be directly related
to the questions addressed here. Nevertheless, since in the
end we want to model observational data, the inclusion of
lightcone effects and redshift distortions has to be considered
as the natural next step in this study (see also Marin
et al. 2008).
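The quoted mass resolution and linking length follow from the box size, particle number and matter density. A minimal sketch of that arithmetic is given below; the critical density value and the function names are assumptions introduced for illustration and are not taken from the text.

```python
# Critical density today in h^2 M_sun / Mpc^3 (standard value; not quoted in the text).
RHO_CRIT = 2.775e11

def particle_mass(omega_m, box_side, n_side):
    """Mass per particle in M_sun/h for n_side**3 particles in a cubic box of side box_side (Mpc/h)."""
    return omega_m * RHO_CRIT * box_side**3 / n_side**3

def linking_length(box_side, n_side, b=0.168):
    """Friends-of-Friends linking length in Mpc/h: b times the mean interparticle distance."""
    return b * box_side / n_side

m_p = particle_mass(0.25, 1536.0, 1024)
l_link = linking_length(1536.0, 1024)
print(f"particle mass  = {m_p:.3e} M_sun/h")
print(f"linking length = {l_link:.3f} Mpc/h")
```

With these inputs the particle mass comes out within one percent of the quoted $2.34 \times 10^{11}\,M_\odot/h$.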



3 LOCAL BIAS PERFORMANCE 

A simple model for halo or galaxy bias was introduced by
Fry and Gaztañaga (1993). These authors assumed that the
density contrast in the halo (or more generally in the galaxy)
distribution, $\delta_h$, can be expressed as a non-linear function
of the local density contrast of dark matter, $\delta_m$, so that
$\delta_h = F[\delta_m]$. On large enough smoothing scales, where the
fluctuations are small, this relation can be expanded in a
Taylor series

$\delta_h = F[\delta_m] = \sum_{k=0}^{\infty} \frac{b_k}{k!}\,\delta_m^k = b_0 + b_1\delta_m + \frac{b_2}{2}\delta_m^2 + \cdots$ (1)

where $\delta_m(r)$ is the local density contrast at position r
smoothed over a given characteristic scale $R_s$. With this
local bias model, one can compute the 2 and 3-point halo
biased correlation functions, to find (Fry & Gaztañaga 1993;
Frieman & Gaztañaga 1994)

$\xi^h(r_{12}) = \langle \delta_h(r_1)\,\delta_h(r_2) \rangle \simeq b_1^2\,\xi(r_{12})$

$Q_3^h(r_{12}, r_{23}, r_{13}) \simeq \frac{1}{b_1}\left[ Q_3(r_{12}, r_{23}, r_{13}) + c_2 \right]$ (2)

where $c_2 = b_2/b_1$ and $\xi(r_{12}) = \langle \delta_m(r_1)\,\delta_m(r_2) \rangle$ is the
2-point matter correlation function. The distance $r_{12}$ corresponds
to the separation of two arbitrary positions 1 and 2.
The hierarchical 3-point function, $Q_3$, has 3 such distances
and is defined in terms of the 3-point correlation function $\zeta$ as

$Q_3 = \frac{\zeta(r_{12}, r_{23}, r_{13})}{\xi(r_{12})\xi(r_{23}) + \xi(r_{12})\xi(r_{13}) + \xi(r_{13})\xi(r_{23})}$ (3)

We need three parameters to specify the triangle formed
by the 3 positions $r_1$, $r_2$ and $r_3$. We will fix two of these
sides ($r_{12}$ and $r_{13}$) and show the results as a function of
$\alpha$, the angle between $r_{12}$ and $r_{13}$. In general $Q_3(\alpha)$ has a
characteristic U or V-shape (see Fig. 6): it is larger for small
and large angles than for intermediate values.
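The leading-order relations above can be checked numerically for a single triangle. The sketch below is illustrative only: the correlation values are made up, and the function names are hypothetical. It builds the leading-order halo correlations for a quadratic local bias, $\xi^h = b_1^2\xi$ and $\zeta^h = b_1^3\zeta + b_1^2 b_2(\xi_{12}\xi_{23} + \xi_{12}\xi_{13} + \xi_{13}\xi_{23})$, and confirms that the resulting hierarchical amplitude equals $(Q_3 + c_2)/b_1$.

```python
import math

def q3(zeta, xi12, xi23, xi13):
    """Hierarchical 3-point amplitude: zeta over the sum of products of 2-point functions."""
    return zeta / (xi12 * xi23 + xi12 * xi13 + xi13 * xi23)

def q3_halo_local(q3_matter, b1, c2):
    """Leading-order local bias prediction for the halo hierarchical amplitude."""
    return (q3_matter + c2) / b1

# Illustrative (made-up) matter correlations for one triangle.
xi12, xi23, xi13, zeta = 0.05, 0.02, 0.03, 0.004
b1, b2 = 2.0, 1.0
c2 = b2 / b1

# Leading-order halo correlations for delta_h = b1*delta_m + (b2/2)*delta_m**2.
s = xi12 * xi23 + xi12 * xi13 + xi13 * xi23
zeta_h = b1**3 * zeta + b1**2 * b2 * s
q3_h_direct = q3(zeta_h, b1**2 * xi12, b1**2 * xi23, b1**2 * xi13)
q3_h_model = q3_halo_local(q3(zeta, xi12, xi23, xi13), b1, c2)
print(q3_h_direct, q3_h_model)
```

The two numbers agree because the factor $b_1^4$ in the denominator cancels against the bias factors in the numerator, leaving a rescaling by $1/b_1$ and a constant shift $c_2/b_1$.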

There is ambiguity over what we should use as the smoothing
scale $R_s$ in Eq. 1. A common and natural choice is that
$R_s$ should be smaller than $r_{12}$. But we will show that this
does not provide a good model for $r_{12} < 60$ Mpc/h. Another
possibility, which we will support here, is to think of
an effective $R_s$ that can be larger than $r_{12}$. Correlations are
estimated from spatial averages over very large volumes and
one can then think of Eq. 1 as some average transformation
over the whole volume. In this sense this bias transformation
just provides an effective description which is only local
over a very large smoothing radius.

When the correlation distances are zero we recover the
corresponding relations between smoothed variances and
skewness (see Eq. 5):

$\sigma_h^2 \simeq b_1^2\,\sigma_m^2$ ; $S_3^h \simeq (S_3 + 3c_2)/b_1$ (4)

Here it is common to identify $R_s$ with the smoothing scale
in the variance and skewness, but this does not need to be the
case.

Figure 1. Scatter plots showing the halo density contrast $\delta_h$, smoothed over top-hat cells, for halos of 50 or more particles versus dark
matter density fluctuations $\delta_m$ smoothed over the same cells. Results are shown for different cell sizes with equivalent radius $R_s$ as
labeled in the figure. Results are for simulation data at redshift z=0 (left panels) and z=0.5 (right panels). The continuous line shows
the least square fit to the local bias parabola.

Figure 2. Variation of $b_1$ (left panels) and $c_2$ (right panels) as a function of the smoothing radius R. Top panels correspond to z = 0.5
and bottom panels to z = 0. Results are shown for different minimum numbers of particles per halo, n. In each panel, from bottom to top:
n=25 (black), n=50 (red), n=100 (green), n=200 (blue) and n=400 (yellow). Given the particle mass of $23.42 \times 10^{10}\,M_\odot$, this yields, after
correcting for resolution effects, minimum masses of 0.5 (black), 1.06 (red), 2.19 (green), 4.49 (blue), and 9.11 (yellow) $\times 10^{13}\,h^{-1} M_\odot$.



In the above equations, and also in Eq. 2, the $\simeq$ sign indicates
that this is the leading order in $\xi$. Note that in general,
even when $\delta_m \ll 1$, the linear bias prescription (i.e., using
only $b_1$) is not accurate for higher-order moments like $Q_3$,
the reason being that nonlinearities in bias (i.e. $c_2$) generate
non-Gaussianities of the same order as those of gravitational
origin. In general, to predict higher order correlations in halos
(or galaxies) to order N the local relation has to be expanded
to order N - 1 (Fry & Gaztañaga 1993). In this paper we will
only study the clustering to order N = 3, which means that
bias only needs to be quadratic in the local model. Thus, in
practice, we will be testing the following model:

$\delta_h = b_0 + b_1\delta_m + \frac{b_2}{2}\delta_m^2 + \epsilon = b_1\delta_m + \frac{b_2}{2}\left(\delta_m^2 - \sigma_m^2\right) + \delta_\epsilon$ (5)

where $\epsilon$ represents the scatter around the local relation (and
also includes higher order contributions in $\delta_m$). Because we
require $\langle \delta_h \rangle = 0$, we have $b_0 = -b_2\sigma_m^2/2 - \langle \epsilon \rangle$ and define
$\delta_\epsilon = \epsilon - \langle \epsilon \rangle$.

One important prediction of this local model is that
the shape of the correlation is unaffected by bias (or
in other words that the effective bias is constant) on large
scales:

$\xi^h(r_{12}) \simeq b_1^2\,\xi(r_{12}) + O[\xi^2(r_{12})]$ (6)

The next-to-leading contribution to the above is proportional
to $\xi^2$ and consequently negligible at large $r_{12}$ where $\xi < 1$.
This will be tested below in Section 3.3.

It is in principle possible to use the shape of $Q_3$ in simulations
(or observations) to separate $b_1$ from $c_2 = b_2/b_1$.
This is done by a fit of the halo (or galaxy) measurements
of $Q_3^h(\alpha)$ in Eq. 2 using the corresponding dark matter predictions
or measurements $Q_3(\alpha)$. Changing $b_1$ will produce a
distortion of the U-shape of $Q_3$ (as a function of $\alpha$), while $c_2$
only produces a constant shift. Thus, unless $Q_3$ is constant
within the errors or $b_1$ is very large, one could simultaneously
measure $b_1$ and $c_2$ from $Q_3$ (Frieman & Gaztañaga
1994; Fry 1994). This idea will be tested below in Section
3.4. One could also predict $b_1$ from the ratio of the halo
to dark matter correlations, $b_1^2 = \xi_h/\xi_m$, but this requires
knowledge of the normalization of the dark matter clustering
amplitude, which is often what we want to fit from
observations. The fit to $Q_3$ can produce an estimate of the
linear bias $b_1$ which is independent of the overall amplitude
of clustering $\xi_m$, because the $Q_3$ prediction is independent
of the normalization. This approach has already been implemented
for the skewness $S_3$ (Gaztañaga 1994, Gaztañaga &
Frieman 1994), the bispectrum (Frieman & Gaztañaga 1994,
Fry 1994, Feldman et al. 2001, Verde et al. 2002) and the angular
3-point function (Fry 1994, Frieman & Gaztañaga 1999,
Gaztañaga & Scoccimarro 2005, Gaztañaga et al. 2005).

3.1 Measurements in $\delta_h$ vs $\delta_m$ scatter plot

We are interested in exploring and determining the local
bias parameters directly. In order to do so we will compare
the halo density contrast $\delta_h$ with the corresponding local
matter density fluctuation $\delta_m$ in the same cells. We will do
this for all cells in the simulation. This will give us a scatter
plot of the relation $\delta_h = F[\delta_m]$ from which we can obtain $b_1$
and $c_2$ by means of a least mean square fit to the local bias
parabola from the Taylor expansion of $\delta_h$ in Eq. 5.
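The least-squares fit to the local bias parabola can be sketched as follows. This is a minimal illustration on synthetic data, not the pipeline used for the simulation: the density fields, the noise level and the function name `fit_local_bias` are assumptions made for the example, and `numpy.polyfit` stands in for whatever fitting routine was actually used.

```python
import numpy as np

def fit_local_bias(delta_m, delta_h):
    """Least-squares fit of delta_h = b0 + b1*delta_m + (b2/2)*delta_m**2; returns (b0, b1, c2)."""
    a2, a1, a0 = np.polyfit(delta_m, delta_h, 2)  # coefficients, highest power first
    b1 = a1
    b2 = 2.0 * a2          # the quadratic coefficient is b2/2
    return a0, b1, b2 / b1

# Synthetic cells: small Gaussian matter fluctuations plus scatter around the parabola.
rng = np.random.default_rng(0)
delta_m = rng.normal(0.0, 0.1, size=20000)
b1_true, c2_true = 1.5, -0.4
b2_true = c2_true * b1_true
scatter = rng.normal(0.0, 0.02, size=delta_m.size)
delta_h = b1_true * delta_m + 0.5 * b2_true * (delta_m**2 - delta_m.var()) + scatter

b0_fit, b1_fit, c2_fit = fit_local_bias(delta_m, delta_h)
print(b0_fit, b1_fit, c2_fit)
```

The fitted constant term recovers $-b_2\sigma_m^2/2$, which is the value required for the halo contrast to average to zero.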

Scatter plots for halos of more than n=50 particles are
shown in Fig. 1 for a selection of sizes of the cubical cell
($l_c = 24, 48, 128$ Mpc/h), which correspond to spherical top
hat volumes of radius $R_s = 14.9, 29.8, 79.4$ Mpc/h as labeled
in the figures. Left and right panels show results for z = 0
and z = 0.5 respectively. It is apparent how the quadratic
bias $c_2$ changes sign from convex ($c_2 < 0$) to concave ($c_2 > 0$)
as the redshift increases.
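The quoted radii are those of spheres with the same volume as the cubical cells, $R_s = l_c\,(3/4\pi)^{1/3}$. A short check of this conversion (the function name is introduced here for illustration):

```python
import math

def equivalent_radius(cell_side):
    """Radius of the sphere whose volume equals that of a cube of side cell_side."""
    return cell_side * (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)

radii = [round(equivalent_radius(l), 1) for l in (24.0, 48.0, 128.0)]
print(radii)  # [14.9, 29.8, 79.4]
```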

One prominent feature in the plots is the discreteness of
the $\delta_h$ values, i.e., that we see horizontal lines in the figures.
This obviously comes from the fact that we have an integer
number of halos in each cell. The step in the halo density
fluctuations is consequently $\Delta\delta_h = 1/\bar{n}$, where $\bar{n}$ is the mean
number of halos in the cells. This is the value of the Poisson
shot-noise, which will decrease when increasing the cell size
or when lowering the mass threshold of halos, since we will
then have a larger $\bar{n}$. The matter density field is also discrete,
but because of the large number of matter particles per cell
this effect is not visible in the plots.
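The origin of the horizontal lines can be illustrated with a toy set of integer halo counts. The counts below are made up; the point is only that the allowed values of the halo contrast are spaced by one over the mean count.

```python
import numpy as np

def halo_contrast(counts):
    """Density contrast per cell from integer halo counts, using the mean count over all cells."""
    counts = np.asarray(counts, dtype=float)
    n_bar = counts.mean()
    return counts / n_bar - 1.0, n_bar

counts = np.array([0, 1, 2, 3, 4, 2, 1, 3])   # toy halo counts per cell
delta_h, n_bar = halo_contrast(counts)
levels = np.unique(delta_h)                    # distinct contrast values present
steps = np.diff(levels)                        # spacing between adjacent levels
print(n_bar, levels, steps)
```

Larger cells or a lower mass threshold raise the mean count, so the levels move closer together and the striping fades.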

3.2 Smoothing scale 

The bias parameters obtained from the least mean square
fit depend on the size of the cell used to smooth the density
field; therefore the issue of what smoothing radius to use
when comparing with clustering bias should be addressed.
First, notice that for the smallest smoothing radius the scatter
of points is very large. In this case many points have
$\delta_m > 1$, which places us in a regime where the Taylor expansion
of $F[\delta_m]$ cannot be applied. When the radius is
set to a larger value the scatter is reduced and almost
all points have $\delta_m < 1$, placing us within the perturbative
regime and producing a particular fit of the bias parameters.

Our results for the dependence of the bias on the smoothing
radius for several halo minimum masses are presented in
Fig. 2.

As expected, we see that the values of $b_1$ and $c_2$ change
significantly as we increase the smoothing radius $R_s$ from 5
to 20-25 Mpc/h, from where they start to converge to their
large scale values. The convergence is reached faster at lower
mass thresholds and, for a fixed mass, at lower redshifts. In
our study we will take smoothing radii of 30 and 60 Mpc/h,
where the convergence regime has been reached.

Figure 3. Symbols with JK errorbars show the 2-point correlation
function $\xi(r)$ from simulations for different minimum numbers of
particles ($N > 25$, 100 or 400) per halo, as labeled in the figure.
The bottom errorbars correspond to the measurements in the
DM distribution. The bottom continuous (dashed) lines in each
panel show the RPT (linear) theory prediction. The upper continuous
lines show the best fit amplitude for the RPT prediction
shape, whose amplitudes are shown in the bottom labels. Top
(bottom) panel corresponds to z=0 (z=0.5).

Throughout the paper, errors on the measured bias parameters
have been computed using the jack-knife method
with 64 subsamples of the density fluctuations field. That
is, we first compute the density fluctuations using the true
mean density of the simulation and then we create the jack-knife
subsamples from which we obtain, using these fluctuations,
a set of 64 bias values. Applying equation [26] gives the
estimated jack-knife error. We have checked that changing the
number of regions does not change the results significantly.

Figure 4. Bias from the ratio of the 2-point correlation function $\xi(r)$
for different minimum numbers of particles per halo, N=25, 100,
400 (from bottom to top). Top panel shows results for z = 0 and
bottom panel for z = 0.5. The dashed lines show the values of the
linear bias fit in the range 25 < r < 40 Mpc/h.
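The jack-knife error estimate described in Section 3.2 can be sketched with the standard delete-one estimator. The equation referred to in the text is not reproduced in this part of the paper, so the formula below is an assumption: it is the usual form, $\sigma^2 = \frac{N-1}{N}\sum_i (b_i - \bar{b})^2$, applied to the N leave-one-out estimates. The toy data and the function name are likewise illustrative.

```python
import numpy as np

def jackknife_error(estimates):
    """Delete-one jack-knife error from the N leave-one-out estimates of a statistic."""
    est = np.asarray(estimates, dtype=float)
    n = est.size
    return np.sqrt((n - 1.0) / n * np.sum((est - est.mean()) ** 2))

# Toy check: 64 regions, with the statistic taken to be the mean of one number per region.
rng = np.random.default_rng(1)
per_region = rng.normal(1.5, 0.2, size=64)
leave_one_out = np.array([np.delete(per_region, i).mean() for i in range(per_region.size)])
err = jackknife_error(leave_one_out)
print(err)
```

For a simple mean this reproduces the usual standard error of the mean, which is a convenient sanity check before applying the estimator to a fitted bias parameter.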



3.3 Comparison with 2-point correlations 

We have computed the 2-point correlation function $\xi(r)$ for
the matter and halo density contrast in the simulation. To
estimate $\xi(r)$ we have used the 4 Mpc/h density mesh of
the simulation and averaged over all pairs of mesh points separated
by $r \pm \Delta r$, where $\Delta r = 0.5$ Mpc/h (see Barriga & Gaztañaga
2002 for details). The results for the matter correlation
function and for different halo masses (given by the
minimum number of particles per halo) are shown in Fig. 3.
The top panel shows the z = 0 case and the bottom panel
the z = 0.5 case. To convert particles to halo mass remember
that $M_p = 2.34 \times 10^{11}\,M_\odot/h$.

As expected, the more massive the halos the more biased
the correlation function. Note as well what is called
stable clustering, i.e., the fact that for a given halo mass threshold
the absolute value of $\xi$ remains approximately constant
in redshift while the matter correlation function decreases
(with redshift). This can be understood because halos
of a given mass but at different redshifts do not correspond
to the same Lagrangian mass. The ones at higher redshift
are situated in rarer (less expected) matter fluctuations,
and are therefore more biased.

The measured correlation function from the simulation
shows very clearly the acoustic peak at about ~ 110 Mpc/h
for both the matter and the halo functions. For comparison,
in this figure we have also plotted the Linear Perturbation
Theory (PT) prediction (dashed lines) and the Renormalized
Perturbation Theory (RPT) prediction (continuous) for
the correlation function, which has been kindly provided by
M. Crocce. The RPT shows deviations from the linear theory
at much larger scales than had been previously thought, and
even at the acoustic peak scale one gets a contribution from
nonlinear effects (Crocce & Scoccimarro 2008). As can be
seen in the figure, these nonlinear contributions result in
a smoother prediction for the acoustic peak shape in the
RPT that is in better agreement with what we find in the
simulations.

We find the bias from

$b(r) = \sqrt{\xi^h(r)/\xi^m(r)}$ (7)

This bias is expected to be constant at large scales in the
local bias model of Eq. 6. Cosmic variance and shot-noise will
add variations to this large scale constant bias. Both errors
get more pronounced for larger scales (where we have few
modes in the simulation) and for larger halo mass thresholds
(since the number of halos is smaller). This can be seen
in Fig. 4, where we plot $b(r)$ as a function of separation for
redshifts z = 0 and z = 0.5 and different mass thresholds.
We do not find any evidence in the data for scale variations
of b for r > 20 Mpc/h. This favors the local bias model, but
note that this statement is only accurate within the ~ 10%
accuracy of our analysis.

We do a fit to a constant $b(r)$, weighted by the inverse
variance, for different ranges of scales. The result is shown
as continuous lines in Fig. 3 and triangles in Fig. 5. The bias
from $\xi(r)$ is slightly larger when we fit to smaller scales of
20-40 Mpc/h, but results are consistent within errors. We
can see in this latter figure that, within its errors, the bias
from clustering is in good agreement with the local bias determined
directly from the $\delta_h$-$\delta_m$ relation at larger scales.
The values in the figure correspond to cell size $R_s = 60$
Mpc/h, where the bias in Fig. 2 has reached its asymptotic
value for all masses. The agreement is not so good for smaller
smoothing scales. Even for cells as large as $R_s = 30$ Mpc/h
we find some deviations in $b_1$ for large masses. This clearly
indicates that the local bias prescription in Eq. 1 is to be
understood as an effective relation smoothed over very large
scales, and that it fails when we try to apply it as a truly local
transformation (where $R_s < r_{12}$, with $R_s < 60$ Mpc/h). Also
note that at these large smoothings the stochastic component
$\delta_\epsilon$ (see Eq. 5) is small, as illustrated in Fig. 1, and that,
in particular, we can neglect the stochasticity correlation between
two different points, $\langle \delta_\epsilon(r_1)\,\delta_\epsilon(r_2) \rangle$, in the modeling
of the 2-point correlation.

Figure 6. Halo biases from the reduced 3-pt function $Q_3(r_{23}, r_{12}, \alpha)$ for z=0 (left set of panels) and z=0.5 (right set of panels) with
$r_{23} = 2r_{12} = 48$ Mpc/h. Each column corresponds to a different halo mass (as labeled in the top panels). TOP: $Q_3$ in dark matter as
measured in the simulation (blue lines) as compared to $Q_3$ in halos (errorbars) of the same simulation. Long dashed red lines show the
local bias model predictions (Eq. 2) for the best fit values of $b_1$ and $c_2$ shown in the bottom panels. BOTTOM: contours in the
$b_1$-$c_2$ plane for $\Delta\chi^2 = 1$, 2.3 and 6.17. Best fit values are found by matching the measured $Q_3^h$ in halos (symbols in top panels) with
predictions in the local bias model, i.e. $Q_3^h = (Q_3^m + c_2)/b_1$, where $Q_3^m$ are the dark matter values (blue continuous lines in top panels).
Filled squares show the values of $b_1$-$c_2$ from the local bias scatter plot $\delta_h$-$\delta_m$ (Fig. 2) at R=60 Mpc/h.

Figure 5. Comparison of different estimates for the linear bias
as a function of the minimal halo mass (horizontal axis: log halo
mass cut, in units of $M_\odot/h$). The continuous line corresponds
to the local model fit to the scatter relation $\delta_m$-$\delta_h$ in
Fig. 2 at R=60 Mpc/h. Triangles correspond to the bias from the
2-point function on large 30-80 Mpc/h scales (open triangles) and
intermediate 20-40 Mpc/h scales (filled triangles).
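The inverse-variance weighted fit of a constant bias to the 2-point measurements of Section 3.3 can be sketched as below. The numbers are made up, the radial bins are treated as uncorrelated (an assumption; the measured bins will in general be correlated), and the bias is taken as the square root of the ratio of halo to matter correlations, which is what Eq. 6 implies at leading order.

```python
import numpy as np

def bias_from_xi(xi_h, xi_m):
    """Scale-dependent bias b(r) = sqrt(xi_h / xi_m), following the leading-order relation of Eq. 6."""
    return np.sqrt(np.asarray(xi_h) / np.asarray(xi_m))

def fit_constant(values, sigma):
    """Inverse-variance weighted mean and its formal error (uncorrelated bins assumed)."""
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    best = np.sum(w * np.asarray(values)) / np.sum(w)
    return best, 1.0 / np.sqrt(np.sum(w))

# Made-up numbers standing in for measurements in a few radial bins.
xi_m = np.array([0.020, 0.012, 0.008, 0.005])
xi_h = 1.6**2 * xi_m
sigma_b = np.array([0.02, 0.03, 0.05, 0.08])
b_fit, b_err = fit_constant(bias_from_xi(xi_h, xi_m), sigma_b)
print(b_fit, b_err)
```

Bins with small errors dominate the weighted mean, which is why a fit restricted to smaller separations can differ slightly from one on larger, noisier scales.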



3.4 Comparison with 3-point correlations 

We have computed the hierarchical relation $Q_3(\alpha)$ (see Eq. 2)
for dark matter and halos in the simulation (as for the
2-point function, we follow Barriga & Gaztañaga 2002). We
use triangles with fixed $r_{23} = 2r_{12} = 48$ Mpc/h and $r_{13}$
given by the angle $\alpha$ between $r_{23}$ and $r_{12}$. Some results for
z = 0 (left) and z = 0.5 (right) are shown in Fig. 6. Dark
matter measurements are shown as (blue) continuous lines
while halo measurements correspond to errorbars. Errorbars
in dark matter are negligible compared to errors in the halo
distribution, which is dominated by shot-noise. The standard
perturbation theory prediction for $Q_3$ is quite close to
the DM measurements on these large scales. Notice the characteristic
U shape in $Q_3(\alpha)$. This is an indication of filamentary
structure, i.e., aligned structures ($\alpha \simeq 0$, $\alpha \simeq 180$ deg)
are more probable than perpendicular configurations (for instance,
equilateral triangles). Spherical structures will produce
constant values of $Q_3(\alpha)$. As the bias increases, the
distribution becomes less filamentary and this information
can be used to measure the bias.

We have fitted the shape of $Q_3$ in simulations to $b_1$ and $c_2$ in
Eq. 2 using the corresponding dark matter measurements $Q_3^m$ (we follow
the procedure described in Gaztanaga & Scoccimarro 2005). Changing $b_1$
produces a distortion of the U-shape of $Q_3$, while $c_2$ only produces
a constant shift. The fits are shown as contours in the bottom panel of
Fig. 6 and they are compared with the values of $b_1$ and $c_2$ (squares)
from the scatter plot in Fig. 2 at $R_s = 60$ Mpc/h. For errorbars we use
the JK covariance matrix. This matrix is degenerate because of the
strong correlations between different $\alpha$ bins. To be safe we only
use the two principal components with the largest eigenvalues (see
Gaztanaga & Scoccimarro 2005). This is quite conservative in terms of
the size of the resulting errorbars. Better estimates would require a
more careful study of the covariance matrix, which is beyond the scope
of this paper.
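
As an illustration of the fit described above, the local model relation $Q_3^h = (Q_3^m + c_2)/b_1$ is linear in $1/b_1$ and $c_2/b_1$, so a simple least-squares sketch recovers both parameters. This is only a minimal example: it uses an unweighted fit rather than the JK covariance and principal components used in the paper, and the function name `fit_b1_c2` and the toy U-shaped input are assumptions made for illustration, not simulation data.

```python
import numpy as np

def fit_b1_c2(q3_matter, q3_halo):
    """Unweighted least-squares fit of Q3_h = (Q3_m + c2) / b1.

    The model is linear in a = 1/b1 and d = c2/b1, so an ordinary
    linear fit is enough. Correlations between angle bins are ignored.
    """
    q3_matter = np.asarray(q3_matter, dtype=float)
    q3_halo = np.asarray(q3_halo, dtype=float)
    design = np.column_stack([q3_matter, np.ones_like(q3_matter)])
    (a, d), *_ = np.linalg.lstsq(design, q3_halo, rcond=None)
    b1 = 1.0 / a
    c2 = d * b1
    return b1, c2

# Toy U-shaped dark matter Q3(alpha); hypothetical values for illustration.
alpha = np.linspace(0.0, np.pi, 18)
q3_m = 0.8 + 0.6 * np.cos(alpha) ** 2
q3_h = (q3_m + 0.5) / 2.0  # halos built with b1 = 2 and c2 = 0.5
b1_fit, c2_fit = fit_b1_c2(q3_m, q3_h)
```

In this toy case the fit returns the input values of $b_1$ and $c_2$; the flattening of the U shape by $1/b_1$ is what constrains $b_1$, while $c_2$ only shifts the curve.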

The values of $b_1$ recovered from $Q_3$ (squares) for different mass
thresholds are shown in Fig. 7. There is good agreement in the general
tendency of $b_1$ as a function of mass, but there are some significant
deviations for small masses ($\log M < 13$). This failure of the local
biasing model for $Q_3$ is intriguing in the light of the very good
agreement that we found from $\xi$ in Fig. 5. This is an important point
to clarify because we do not know $b_1$ in the real universe and we were
hoping to be able to use the values of $b_1$ from $Q_3$ to find the dark
matter normalization of $\xi$. According to Fig. 7 this will produce a
significant (2-sigma level) deviation for small halo masses.

This mismatch can hardly be attributed to the stochastic component
(which also includes non-local contributions). As in the case of the
2-point function, because the smoothing radius in the local model is
very large, we expect the stochastic correlation components to be
subdominant (see section above). A key difference between the 2-point
and the 3-point function is that the former takes isotropic averages
while the latter keeps anisotropic information (something which is not
captured by the skewness either, see below, which is a third order
statistic but is smoothed in spherical cells). So our findings hint
that we need some anisotropic component in the halo biasing model in
Eq. 1, at least for $\log M \sim 13$. This conclusion might not be
generic. For biasing in galaxy mock catalogs where $b \sim 1$,
corresponding to lower mass thresholds in the halo picture, Gaztanaga &
Scoccimarro (2005) and Marin et al. (2008) found good agreement between
the values of $b_1$ coming from $\xi$ and $Q_3$ clustering under the
local model. More work needs to be done to clarify these issues.



3.5 Comparison with the variance and skewness 

So far we have studied the bias from 2-pt and 3-pt correlation
functions because they do not suffer from the discreteness effects that
appear in the variance and the skewness. However, the latter are closer
to the local model assumptions (since they probe locally smoothed
quantities). Since they bring different aspects to the comparison we
will also study them here.

We define the variance $\sigma^2$ and skewness $m_3$ as second and third
order moments of the fluctuation field:

$$\sigma^2 \equiv \langle \delta^2 \rangle = \frac{1}{N} \sum_{i=1}^{N} \delta_i^2 , \qquad m_3 \equiv \langle \delta^3 \rangle = \frac{1}{N} \sum_{i=1}^{N} \delta_i^3$$ (8)

where the sum is over a fair sample of points in the simulation
(ergodic assumption). In this case, one typically considers these
quantities as a function of the smoothing radius R. It is also
convenient to define the normalized skewness:



[Figure 7 plot. Horizontal axis: log halo mass cut (units $M_\odot/h$), ticks from 12.5 to 14. Legend: scatter $\delta_h$-$\delta_m$.]

Figure 7. Comparison of different estimates for the linear bias as
a function of the minimal halo mass. The continuous line corresponds
to the local model fit to the scatter relation $\delta_m$-$\delta_h$ in
Fig. 2 at $R_s = 60$ Mpc/h. Open squares come from fitting the 3-pt
function $Q_3$, i.e., see Fig. 6.

$$S_3 \equiv \frac{m_3}{\sigma^4}$$ (9)
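
The moments in Eqs. 8 and 9 are plain sample averages over the smoothed fluctuation field. A minimal sketch, assuming the fluctuations are already available as an array of values with zero mean (the function name `moments` and the three-point toy sample are illustrative only):

```python
import numpy as np

def moments(delta):
    """Return (variance, third moment, normalized skewness S3).

    Assumes `delta` holds smoothed density fluctuations with zero mean,
    sampled at a fair set of points.
    """
    delta = np.asarray(delta, dtype=float)
    variance = np.mean(delta ** 2)   # sigma^2 = <delta^2>
    m3 = np.mean(delta ** 3)         # m_3 = <delta^3>
    s3 = m3 / variance ** 2          # S_3 = m_3 / sigma^4
    return variance, m3, s3

sample = np.array([-1.0, -1.0, 2.0])  # toy sample with zero mean
variance, m3, s3 = moments(sample)
```

No shot-noise correction is applied here; for discrete tracers the corrections discussed in the following subsections would have to be subtracted first.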

3.5.1 Variance

One of the common ways of determining the linear bias of galaxies or
halos is by comparing their variance with the measured or predicted
matter variance. However, to do the comparison correctly one has to
account for the shot-noise contribution (a very similar problem occurs
in the estimation of the power spectrum, which is the variance in
Fourier space). This contribution to the variance appears because
galaxies and halos are not continuous fields but discrete ones. For a
top hat window function, $W_R(r) = \Theta(R - |r|)$, the Poisson
shot-noise is well known and is equal to $1/\bar{n}$, where $\bar{n}$ is
the mean number of halos in a sphere of radius R. The shot-noise
corrected variance is therefore:

$$\sigma^2(R) = \langle \delta^2 \rangle - \frac{1}{\bar{n}}$$ (10)

where R stands for the window function smoothing scale. The dark matter
in the simulation is also a discrete field and, as mentioned before, it
will have its own shot-noise correction, which will be much smaller
than the halo one due to its higher number density. Now, we can use the
halo variance to compute an estimator for the linear bias as

$$b_{hh} \equiv \frac{\sigma_h(R)}{\sigma_m(R)} = \sqrt{\frac{\langle \delta_h \delta_h \rangle - 1/\bar{n}}{\langle \delta_m \delta_m \rangle}}$$ (11)

Another estimator for the linear bias that can be computed from the
simulation is

[Figure 8 plot. Legend entries: Poisson corrected variance bias; local linear bias; nonlinear estimation; nonlinear estimation with Poisson correction.]

Figure 8. Bias as a function of halo mass. The linear bias $b_1$
(shown as black lines) is estimated from a fit to the scatter plot
$\delta_h$-$\delta_m$ in the simulations. This is compared with the bias
values obtained from the (shot-noise corrected) variance
$b_{hh} = \sigma_h/\sigma_m$ (red squares) and from the cross-correlation
$b_{hm} = \langle \delta_m \delta_h \rangle / \sigma_m^2$ (blue triangles).
Also shown are the predictions for $b_{hh}$ and $b_{hm}$ after applying
non-linear contributions (i.e., Eqs. 16 and 17). Results are shown for
both z = 0 (bottom lines and symbols) and z = 0.5 (on top).

$$b_{hm} \equiv \frac{\langle \delta_h \delta_m \rangle}{\langle \delta_m \delta_m \rangle}$$ (12)
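
The two estimators in Eqs. 11 and 12 can be sketched as follows. This is a minimal illustration, assuming halo and matter fluctuations measured in the same cells and a Poisson shot-noise term $1/\bar{n}$ for the halos only; the function name `bias_estimators` and the toy input are hypothetical, and the (much smaller) dark matter shot-noise is neglected.

```python
import numpy as np

def bias_estimators(delta_h, delta_m, nbar):
    """Return (b_hh, b_hm) from fluctuations in matched cells.

    nbar is the mean number of halos per cell, used for the Poisson
    shot-noise correction of the halo variance only.
    """
    delta_h = np.asarray(delta_h, dtype=float)
    delta_m = np.asarray(delta_m, dtype=float)
    var_m = np.mean(delta_m ** 2)
    var_h = np.mean(delta_h ** 2) - 1.0 / nbar    # shot-noise corrected
    b_hh = np.sqrt(var_h / var_m)                 # variance ratio
    b_hm = np.mean(delta_h * delta_m) / var_m     # cross-correlation
    return b_hh, b_hm

# Toy deterministic linear tracer with b = 2 and no shot-noise.
delta_m = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])
delta_h = 2.0 * delta_m
b_hh, b_hm = bias_estimators(delta_h, delta_m, nbar=np.inf)
```

For a deterministic linear tracer both estimators agree; differences between them in real samples signal non-linear terms, scatter, or an inaccurate shot-noise model.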



In Fig. 8 we show the values for the different bias estimators computed
in cubical cells of side $l_c = 48$ Mpc/h. The variance in a cubical
cell is very similar to the one in a top hat smoothing sphere of the
same volume as the cube (Baugh, Gaztanaga & Efstathiou 1995). For our
cells of side $l_c = 48$ Mpc/h the spherical equivalent radius is
R = 29.8 Mpc/h. Errors in the figure are from the Jack-knife method
with 64 regions, and we have checked that changing the number of
regions does not change the results significantly.
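
The equivalent radius quoted above follows from equating the volume of the cube to that of a sphere, $l_c^3 = (4\pi/3) R^3$. A one-line check (the function name is illustrative):

```python
import math

def equivalent_sphere_radius(cell_side):
    """Radius of the sphere with the same volume as a cube of this side."""
    return (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0) * cell_side

radius = equivalent_sphere_radius(48.0)  # cell side in Mpc/h
```

For a side of 48 Mpc/h this gives a radius close to the 29.8 Mpc/h quoted in the text.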

We can see that all three bias estimators, $b_1$, $b_{hh}$ and
$b_{hm}$, give significantly different results given the errorbars.
Consequently one needs to be cautious when trying to use these bias
estimators for precision cosmology, where errors lower than 10% are
sought. Below we discuss the origin of these differences, focusing
mainly on non-linear and discreteness effects, which we find are the
dominant effects. Other contributions could arise from the truncation
of the Taylor expansion.



3.5.2 Skewness 

An important clustering statistic for understanding quadratic bias is
the skewness. Like all the moments and cumulants of the halo field, it
has to be shot-noise corrected. For the normalized skewness this
correction is found to be (e.g. see Gaztanaga 1994):



[Figure 9 plot. Horizontal axis: halo mass M. Legend: $c_2$ from scatter plot; from $S_3$ and $b_1$ scatter; from $S_3$ and Poisson corrected $b_{hh}$; from $S_3$ and $b_{hm}$.]



Figure 9. Dependence of $c_2$ on the halo mass as measured directly
from the $\delta_h$-$\delta_m$ local relation in the simulation (black
lines) compared with the values obtained from the skewness and three
different linear bias estimates in Eq. 15: $b_1$ from the local relation
(red squares), $b_{hh}$ (blue diamonds) and $b_{hm}$ (pink triangles).
Top (bottom) set of lines are for z = 0 (z = 0.5).

$$S_3 = \frac{\langle \delta^3 \rangle - 3\sigma^2(R)/\bar{n} - 1/\bar{n}^2}{\sigma^4(R)}$$ (13)



where $\sigma^2$ is again the shot-noise corrected variance and R
stands for the window function smoothing scale. Note that when
comparing the measured skewness to predictions one has to take into
account the fact that we are smoothing the density field. For a top
hat smoothing and CDM power spectrum the normalized skewness can be
approximated by (Juszkiewicz, Bouchet & Colombi 1993; Cooray & Sheth
2002; Bernardeau et al. 2002)



$$S_3 \simeq 4 + \frac{6}{7}\,\Omega_m^{-2/63} + \gamma_1$$ (14)



where $\gamma_1 \equiv d\log\sigma^2(R)/d\log R$. Obviously for the
Einstein-de Sitter cosmology and no smoothing we recover the well known
value in the spherical collapse model, 34/7 (see Fosalba & Gaztanaga
1998 for the interpretation in terms of the spherical collapse model).
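
A short numerical sketch of the two skewness expressions may help. It assumes the standard forms written in the comments, namely the smoothed perturbation theory value $S_3 \simeq 4 + (6/7)\,\Omega_m^{-2/63} + \gamma_1$ and the Poisson shot-noise correction of the third moment; the function names are illustrative and not taken from the paper.

```python
def s3_tophat(gamma1, omega_m=1.0):
    """Perturbation theory skewness for top hat smoothing.

    Assumed form: S3 = 4 + (6/7) * Omega_m**(-2/63) + gamma1,
    with gamma1 = dlog(sigma^2)/dlog(R).
    """
    return 4.0 + (6.0 / 7.0) * omega_m ** (-2.0 / 63.0) + gamma1

def s3_shot_noise_corrected(m3_raw, var_corrected, nbar):
    """Normalized skewness after Poisson shot-noise correction.

    Assumed form: m3 = <delta^3> - 3 sigma^2 / nbar - 1 / nbar^2,
    where sigma^2 is the already corrected variance.
    """
    m3 = m3_raw - 3.0 * var_corrected / nbar - 1.0 / nbar ** 2
    return m3 / var_corrected ** 2

s3_eds_unsmoothed = s3_tophat(gamma1=0.0)  # Einstein-de Sitter, no smoothing
```

With `gamma1 = 0` and `omega_m = 1` the first function returns 34/7, the unsmoothed Einstein-de Sitter value mentioned in the text.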

With the skewness and the linear bias we can easily compute $c_2$ as
(see section §2)

$$c_2 = \frac{b_1 S_3^h - S_3^m}{3}$$ (15)



Here we can either use the direct local $b_1$ as measured from the
$\delta_h$-$\delta_m$ scatter plot or other estimators of the linear
bias such as $b_{hm}$ or $b_{hh}$. Results are shown in Fig. 9 and
compared with the $c_2$ obtained directly from the $\delta_m$-$\delta_h$
scatter plot fit. Errors for these points are computed by means of the
Jack-knife method with 64 subsamples in the simulation. As in the case
of the variance we find significant deviations between the different
estimators. Next-to-leading order contributions as well as modeling of
the stochasticity would be needed for precision cosmology.



3.6 Non-Linear effects and stochasticity



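In order to assess how good the linear approximation is, we compute the
nonlinear contribution to the linear bias estimators $b_{hm}$ and
$b_{hh}$ using Eq. 5. We start with $b_{hm}$, which should be subject to
smaller discreteness effects. To next order it is:

$$b_{hm} = \frac{\langle \delta_h \delta_m \rangle}{\langle \delta_m \delta_m \rangle} \simeq b_1 + \frac{1}{2} b_2 S_3 \sigma_m^2 + b_\epsilon$$ (16)

where $b_\epsilon \equiv \langle \delta_\epsilon \delta_m \rangle / \langle \delta_m \delta_m \rangle$
and $\delta_\epsilon \equiv \epsilon - \langle \epsilon \rangle$. For
symmetry reasons, $b_\epsilon$ can be expected to be very small, as we
will show next.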

Nonlinear corrections in Eq. 16 seem to account well for the difference
that we saw in Fig. 8 between the measured $b_{hm}$ (blue triangles)
and the linear bias $b_1$ (black continuous line) from the fit to the
scatter plots. The non-linear correction to $b_{hm}$ in Eq. 16 is also
shown in Fig. 8 as a blue line (for $b_\epsilon = 0$) and it overlaps
well with the $b_{hm}$ measurements within errors. The nonlinear terms
are therefore large (10-15% effect) and certainly have to be taken into
account in precision cosmology. We infer from this very good agreement
that the contributions from the scatter $b_\epsilon$ in Eq. 16 and the
effect of the Taylor truncation (i.e., higher orders in the expansion)
are negligible given the errors.
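
For reference, the structure of the correction to $b_{hm}$ can be recovered in a few lines. The sketch below assumes that the local relation is a Taylor expansion truncated at second order with an additive scatter term $\epsilon$, and that $\langle \delta_m \rangle = 0$ and $\langle \delta_m^3 \rangle = S_3 \sigma_m^4$; these assumptions are stated here for illustration and the exact form of the expansion is the one given in Eq. 5.

```latex
\begin{align}
\delta_h &= b_1\,\delta_m + \frac{b_2}{2}\left(\delta_m^2 - \sigma_m^2\right) + \epsilon , \\
\langle \delta_h \delta_m \rangle &= b_1\,\sigma_m^2
   + \frac{b_2}{2}\,\langle \delta_m^3 \rangle
   + \langle \epsilon\,\delta_m \rangle
   = b_1\,\sigma_m^2 + \frac{b_2}{2}\,S_3\,\sigma_m^4
   + \langle \delta_\epsilon \delta_m \rangle , \\
b_{hm} &= \frac{\langle \delta_h \delta_m \rangle}{\sigma_m^2}
   = b_1 + \frac{1}{2}\,b_2\,S_3\,\sigma_m^2 + b_\epsilon .
\end{align}
```

The term proportional to $\sigma_m^2 \langle \delta_m \rangle$ vanishes, and $\langle \epsilon\,\delta_m \rangle = \langle \delta_\epsilon \delta_m \rangle$ because $\delta_m$ has zero mean, which is why only the skewness enters the leading non-linear correction.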

The corresponding correction for $b_{hh}$ is:

$$b_{hh}^2 \simeq b_1^2 + \left[ b_1 S_3 + \frac{1}{2} b_2 + \frac{1}{4} b_2 S_4 \sigma_m^2 \right] b_2 \sigma_m^2 + \epsilon_{hh}$$ (17)



where the second term includes all the non-linear cor- 
rections and the third term is: 



£hh + 



(18) 



which only includes terms involving the scatter. As pointed out above,
because of symmetry, we expect linear terms in $\delta_\epsilon$ to
vanish, so that $\langle \delta_\epsilon \delta_m^n \rangle \simeq 0$ for
$n = 0, 1, 2$. This is well supported by the good agreement that we
found above between $b_{hm}$ in Eq. 16 (with $b_\epsilon \simeq 0$) and
the measurements in Fig. 8. But this is not necessarily the case for
the quadratic term $\langle \delta_\epsilon^2 \rangle$, because there is
no cancellation between positive and negative fluctuations. The
$1/\bar{n}$ term comes from the shot-noise correction (i.e., Eq. 11),
which allows us to move from the discrete to the continuous halo
variance; it assumes that halos are a Poisson sample of the dark matter
field. If all the scatter $\langle \delta_\epsilon^2 \rangle$ in the
local relation were just Poisson, then we would expect that
$\epsilon_{hh} \simeq \langle \delta_\epsilon^2 \rangle - 1/\bar{n} \simeq 0$.

In Fig. 8 we show how the non-linear corrections in Eq. 17 fail to
explain the difference between $b_{hh}$ and $b_1$. The predicted
$b_{hh}$ (dashed line) is higher than $b_{hm}$ (blue triangles) while
the measured one (red squares) is lower. In fact, the nonlinear terms
seem to increase the differences between the predicted and measured
bias. This could be explained if $\epsilon_{hh}$ turns out to be
negative, which would happen if the scatter is sub-Poisson (smaller
than Poisson) and consequently we overcorrected the shot-noise by using
the $1/\bar{n}$ term. Sub-Poisson shot-noise has been found in
simulations (Casas-Miranda et al. 2002) for halos larger than $M_*$
(see footnote).

Footnote: Also note that the same effect seems to result in super-Poisson
errorbars (Cabre & Gaztanaga 2009). These two statements are




[Figure 10 plot. Horizontal axis: halo mass M [$M_\odot/h$].]



Figure 10. Comparison of the Poisson shot-noise correction $1/\bar{n}$
(continuous line) and the scatter $\langle \delta_\epsilon^2 \rangle$
(dashed line) in the local bias. There are two sets of lines, one for
each redshift as labeled in the figure (larger values correspond to
z = 0). Dotted lines show $\omega$, the ratio of the two in Eq. 19.



We have indeed found that our halo simulations have
$\langle \delta_\epsilon^2 \rangle$ smaller than $1/\bar{n}$. This can be
seen in Fig. 10, which compares the two terms. Besides shot-noise or
discreteness effects, $\langle \delta_\epsilon^2 \rangle$ also includes
other sources of scatter: non-deterministic bias and possibly
contributions of higher order than the quadratic terms in Eq. 5. The
latter is a smoothed component and is unlikely to result in a major
increase in the actual scatter. Fig. 10 therefore indicates that the
final scatter is overestimated by the Poisson model. We can write the
new effective shot-noise term as:

$$\langle \delta_\epsilon^2 \rangle \equiv \frac{\omega}{\bar{n}}$$ (19)

where $\omega$ is plotted as dotted lines in Fig. 10. For small halo
masses $\omega$ tends to unity, while it is roughly constant,
$\omega \simeq 0.6 - 0.8$, for larger masses.

In Fig. 11 we apply the discreteness correction to the prediction
rather than to the measurements (which are not corrected here for
Poisson shot-noise). When we use the new estimate for the scatter,
i.e., $\epsilon_{hh} = \langle \delta_\epsilon^2 \rangle$, we find a very
good match between the predictions and the measurements for $b_{hh}$.
We can see here that, as happened for $b_{hm}$ in Fig. 8,
non-linearities are also important for the variance. The main
difference between $b_{hm}$ and $b_{hh}$ is that the latter also needs
a shot-noise correction that is different from Poisson, at least for
large halo masses. Both discreteness and non-linearities are needed to
interpret the bias from the variance.



not in contradiction because the former refers to the Poisson
correction to the mean variance (in Fourier or configuration space)
while the latter refers to the noise or error (around the mean 2-point
function) induced by discreteness.

Figure 11. Bias $b_{hh}$ as a function of halo mass as in Fig. 8,
but here we do not apply the Poisson shot-noise correction to the
measurements of $b_{hh}$. We apply instead the discreteness correction
to the predictions (this correction is estimated from the scatter
$\langle \delta_\epsilon^2 \rangle$ shown in Fig. 10). We only find good
agreement between the non-linear predictions (red lines) and
measurements (squares) after both discreteness and non-linear terms are
included.



3.7 Cross correlation and stochasticity 

A simple measure to study deviations away from the local linear bias
relation has been pointed out by several authors (see Tegmark & Peebles
1998, Dekel & Lahav 1999, Seljak, Hamaus & Desjacques 2009, Cai,
Bernstein & Sheth 2010, and references therein). This is to consider
the dimensionless cross-correlation coefficient between the
distribution of mass and galaxies (we use halos as a proxy for galaxies
in our case):

$$r \equiv \frac{\langle \delta_m \delta_h \rangle}{\sqrt{\langle \delta_m \delta_m \rangle \langle \delta_h \delta_h \rangle}}$$ (20)

which is in general a function of scale. In the local linear bias model
r = 1. But both non-linearities and stochasticity (the scatter around
the local relation) can move this away from one. Note that this test is
fundamentally different from the previous tests of the local bias. This
test focuses on how important the stochasticity is in the bias
relation. For a deterministic function one expects r = 1, but when the
scatter in the $\delta_h$-$\delta_m$ relation is large one would expect
it to have a different impact on the two parts of this ratio.

For 1-point smoothed fields, in our notation (see Eqs. 11 and 12) this
corresponds to:

$$r = \frac{b_{hm}}{b_{hh}}$$ (21)
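
The behaviour of r under scatter can be illustrated with a small sketch. It assumes fluctuations measured in matched cells and an optional noise term subtracted from the halo variance (either $1/\bar{n}$ or the measured scatter); the function name and the four-cell toy data are hypothetical.

```python
import numpy as np

def cross_correlation_coefficient(delta_m, delta_h, noise=0.0):
    """r = <dm dh> / sqrt(<dm dm> (<dh dh> - noise)).

    `noise` is the discreteness or scatter term removed from the halo
    variance; noise = 0 gives the uncorrected coefficient.
    """
    delta_m = np.asarray(delta_m, dtype=float)
    delta_h = np.asarray(delta_h, dtype=float)
    cross = np.mean(delta_m * delta_h)
    var_m = np.mean(delta_m ** 2)
    var_h = np.mean(delta_h ** 2) - noise
    return cross / np.sqrt(var_m * var_h)

# Toy linear tracer (b = 2) plus scatter uncorrelated with delta_m.
delta_m = np.array([-1.0, 1.0, -1.0, 1.0])
scatter = np.array([0.5, 0.5, -0.5, -0.5])
delta_h = 2.0 * delta_m + scatter
r_raw = cross_correlation_coefficient(delta_m, delta_h)
r_corrected = cross_correlation_coefficient(
    delta_m, delta_h, noise=np.mean(scatter ** 2))
```

In this toy case the uncorrected coefficient falls below one because the scatter inflates the halo variance but not the cross term, and subtracting the scatter variance restores r = 1.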

We can estimate this quantity directly from the variance measured in
simulations. The result is shown in Fig. 12 as a function of halo mass.
We compare measurements without any correction (lower lines) and using
two different ways to correct for scatter and discreteness effects in
the halo variance: the Poisson corrected variance,
$\langle \delta_h^2 \rangle - 1/\bar{n}$, and



[Figure 12 plot. Horizontal axis: halo mass M [$M_\odot/h$]. Curves: Poisson corrected, scatter corrected and without correcting, each for z = 0.0 and z = 0.5.]



Figure 12. Dimensionless cross-correlation coefficient r in Eq. 20
as a function of halo mass. This is for 1-point fluctuations
smoothed over cells of $\simeq 30$ Mpc/h radius. Different pairs of
lines show results using different ways to correct for the
discreteness in the halo variance. Dashed (continuous) lines correspond
to z = 0.5 (z = 0.0).

Figure 13. Symbols with errorbars show r in Eq. 20, i.e., the
dimensionless cross-correlation coefficient between dark matter
and halos with $M > 5 \times 10^{\ldots}$ in the MICE simulation at z = 0.5.
The continuous line corresponds to the scale dependent bias $b_{hh}$,
normalized to the mean value.



the scatter corrected variance,
$\langle \delta_h^2 \rangle - \langle \delta_\epsilon^2 \rangle$. The
cross-correlation deviates significantly from unity if we do not
correct for these effects. Deviations increase with halo mass and
redshift, and can be as large as 20-30% for large halos. As shown
before (e.g. see Fig. 10), the Poisson model does not provide a good
correction for the scatter. If we use instead the scatter away from the
local relation, as measured in the simulation, we recover values which
are close to unity.

We can also estimate r in the 2-point correlation function, which
should be less affected by discreteness effects. Fig. 13 shows r as a
function of scale (separation between pairs) for halos with
$M > 5 \times 10^{\ldots}\,M_\odot$. In Fig. 13 we estimate JK errors from
the r ratio, i.e., we estimate the ratio in different JK subsamples and
calculate the error from the scatter in the JK regions (this produces
smaller errorbars because sampling variance mostly cancels in the
ratio). For comparison we also show in this figure (continuous line)
how much $b_{hh}$ deviates from a constant (i.e., from Fig. 4). The
measurements are compatible with unity on all scales. There is a hint
of a deviation ($\sim 3\%$) around the BAO scale, which could be related
to recent findings about scale dependent bias (e.g. see Desjacques et
al. 2010 and references therein).

Similar results, but with much larger errors, are found for larger
halo masses and different redshifts. Note that for masses larger than
$10^{\ldots}\,M_\odot$ Manera et al. (2010) found $b_{hm}/b_{hh}$ to be
slightly larger than unity, with $b_{hm}$ measured at low k in Fourier
space and $b_{hh}$ at large separations in the autocorrelation
function.

Altogether, our analysis indicates that the linear local bias model
provides a very good approximation, within our sampling errors, for the
2-point function. On scales larger than $r \simeq 20$ Mpc/h, the
halo-halo and halo-mass correlations are, to a good approximation,
linear tracers of the underlying dark-matter correlation function and
the resulting bias is just the one expected in linear theory. This
conclusion is important to interpret measurements of redshift space
distortions and BAO in galaxy surveys, which on large scales are
usually interpreted under the assumption that linear bias and linear
theory are good approximations.

This is not so much the case for the variance, which seems more
affected by non-linearities and discreteness effects. This is
understood from the fact that the variance (as well as the power
spectrum) is quadratic in fluctuations and is an average over all
scales, including small, non-linear scales.



4 BIAS PREDICTIONS FROM THE MASS
FUNCTION AND THEIR PERFORMANCE

In the previous section we have studied how the local bias performs
when compared with the bias measured from clustering. In this section
we will compare those bias values with predictions from the mass
function.



4.1 Bias predictions from the mass function 

In the peak-background split Ansatz (Bardeen et al. 1986; Cole &
Kaiser 1989) one can relate the halo bias to the halo mass function at
large scales by treating perturbed regions as if they were unperturbed
regions in a universe with a slightly different background cosmology
but the same age (Martino & Sheth 2009).

Consequently, from a well motivated functional form of the mass
function, one can derive theoretical predictions for the halo bias
parameters as well as study their accuracy (Mo et al. 1997, Scoccimarro
et al. 2001, Cooray & Sheth 2002, Manera et al. 2010).

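In this paper we will use the Sheth and Tormen (1999) mass function:

$$n(m)\,dm = \frac{\bar{\rho}_m}{m}\, f(\nu)\, d\nu$$ (22)

$$\nu f(\nu) = A(p) \left( 1 + (q\nu)^{-p} \right) \left( \frac{q\nu}{2\pi} \right)^{1/2} \exp\left( -\frac{q\nu}{2} \right)$$ (23)

where $A(p) = \left[ 1 + 2^{-p}\,\Gamma(1/2 - p)/\sqrt{\pi} \right]^{-1}$ is the normalized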
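amplitude. The corresponding bias predictions are

$$b_1(m, z) = 1 + \epsilon_1 + E_1$$
$$b_2(m, z) = 2 (1 + a_2)(\epsilon_1 + E_1) + \epsilon_2 + E_2$$ (24)

where $a_2 = -17/21$, and

$$\epsilon_1 = \frac{q\nu - 1}{\delta_{sc}(z)}, \qquad E_1 = \frac{2p/\delta_{sc}(z)}{1 + (q\nu)^p},$$
$$\epsilon_2 = \frac{q\nu}{\delta_{sc}(z)} \left( \frac{q\nu - 3}{\delta_{sc}(z)} \right), \qquad \frac{E_2}{E_1} = \frac{1 + 2p}{\delta_{sc}(z)} + 2\epsilon_1$$ (25)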



Throughout the paper $\nu \equiv \delta_{sc}^2(z) / (D^2(z)\,\sigma_0^2(m))$.
In this notation $D(z)$ is the growth factor in units of its value at
z = 0; $\sigma_0^2(m)$ is the linear variance of the matter field at
redshift z = 0, when smoothed with a top hat filter of radius
$R = (3m / 4\pi\bar{\rho}_m)^{1/3}$; and $\delta_{sc}(z)$ is the critical
density contrast for collapse at a given redshift z. Although it is
popular in the literature to use a fixed value for $\delta_{sc}$, we
will be using its proper redshift dependence from spherical collapse
(Eke et al. 1996, Cooray & Sheth 2002), since there are some
indications that in this case the mass function is closer to universal
(Manera et al. 2010).
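
The Sheth and Tormen expressions are simple enough to evaluate directly. The sketch below assumes the standard forms $\nu f(\nu) = A(p)\,[1 + (q\nu)^{-p}]\,(q\nu/2\pi)^{1/2} \exp(-q\nu/2)$ and $b_1 = 1 + (q\nu - 1)/\delta_{sc} + (2p/\delta_{sc})/[1 + (q\nu)^p]$ with $\nu = \delta_{sc}^2/\sigma^2$; the function names are illustrative, and $\nu$ and $\delta_{sc}$ are taken as inputs rather than computed from a cosmology.

```python
import math

def st_amplitude(p):
    """Normalization A(p) = [1 + 2**(-p) Gamma(1/2 - p) / sqrt(pi)]**(-1)."""
    return 1.0 / (1.0 + 2.0 ** (-p) * math.gamma(0.5 - p) / math.sqrt(math.pi))

def st_nu_f_nu(nu, p=0.3, q=0.75):
    """Sheth-Tormen multiplicity function nu * f(nu), nu = (delta_sc / sigma)**2."""
    qnu = q * nu
    return (st_amplitude(p) * (1.0 + qnu ** (-p))
            * math.sqrt(qnu / (2.0 * math.pi)) * math.exp(-0.5 * qnu))

def st_linear_bias(nu, delta_sc, p=0.3, q=0.75):
    """Peak-background split linear bias b1 = 1 + eps1 + E1."""
    qnu = q * nu
    eps1 = (qnu - 1.0) / delta_sc
    e1 = (2.0 * p / delta_sc) / (1.0 + qnu ** p)
    return 1.0 + eps1 + e1

amplitude_fiducial = st_amplitude(0.3)
```

For p = 0.3 the amplitude evaluates to about 0.3222, and setting p = 0, q = 1 reduces both functions to the Press-Schechter case, for which $b_1 = 1$ at $\nu = 1$.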

These predictions for the bias depend on the mass function through the
parameters p and q. When p = 0 and q = 1 we recover the Press-Schechter
(Press & Schechter 1974) formula. The original values for this mass
function fit were $p \simeq 0.3$ and $q \simeq 0.7$ (Sheth & Tormen
1999); it was argued afterwards that q = 0.75 (with $A \simeq 0.3222$)
gives better results (Sheth & Tormen 2001; Cooray & Sheth 2002). We
confirmed that this is the case and consequently we will use the latter
values as the fiducial ST values. At the same time we will also use our
own set of p and q values, obtained by fitting the mass function as we
explain in section 4.2. Bias parameters from a mass function with a
functional form like that of Warren et al. (2006) have been studied by
Manera et al. (2010) and were shown to give results similar to those of
ST for a range of masses similar to that of this paper.

The above $b_i(m)$ predictions are for a given halo mass, but the
simulation results are for halos above a mass threshold. Consequently,
in this paper we integrate these predictions over the mass range,
weighting according to the number of halos at each mass. If one has a
model for populating halos with galaxies, one can weight each halo by
the number of galaxies given by the halo occupation distribution (HOD),
and therefore obtain a prediction for the galaxy bias. For an approach
to how this can be done see, for instance, Sefusatti & Scoccimarro
(2005) or Tinker & Wetzel (2010). In this paper we are interested in
separating these two steps and understanding the errors that come from
the halo predictions for clustering.
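
The number-weighted average over a mass threshold can be sketched as a ratio of two integrals, $b(>M_{min}) = \int b(m)\,n(m)\,dm / \int n(m)\,dm$. The example below is a minimal illustration that assumes tabulated values of the mass function and of the bias on a common mass grid and uses a trapezoid rule; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def bias_above_threshold(mass, number_density, bias, m_min):
    """Number-weighted mean bias of halos with mass >= m_min.

    mass, number_density (dn/dm) and bias are tabulated on the same
    increasing mass grid; integrals use the trapezoid rule.
    """
    mass = np.asarray(mass, dtype=float)
    n = np.asarray(number_density, dtype=float)
    b = np.asarray(bias, dtype=float)
    keep = mass >= m_min
    m, n, b = mass[keep], n[keep], b[keep]
    dm = np.diff(m)
    numerator = np.sum(0.5 * (b[1:] * n[1:] + b[:-1] * n[:-1]) * dm)
    denominator = np.sum(0.5 * (n[1:] + n[:-1]) * dm)
    return numerator / denominator

# Toy table in arbitrary units, for illustration only.
toy_mass = [1.0, 2.0, 3.0]
toy_dndm = [1.0, 1.0, 1.0]
toy_bias = [1.0, 2.0, 3.0]
b_eff = bias_above_threshold(toy_mass, toy_dndm, toy_bias, m_min=1.0)
```

Replacing the weight $n(m)$ by $n(m)$ times a mean galaxy occupation would give the corresponding HOD-weighted galaxy bias mentioned above.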



4.2 Mass function fits 

We have computed the mass function of halos for the MICE simulation.
We show it in Fig. 14. Halos have been found using a Friends of Friends
algorithm with a linking length of 0.168 times the mean interparticle
distance, and their masses have been corrected for discreteness effects
following Warren et al. (2006), i.e., the mass of the halo has been set
equal to $M_p N (1 - N^{-0.6})$, where N is the number of particles and
$M_p$ the particle mass (which is $2.34 \times 10^{11}\,M_\odot$ in our
simulation). The Warren correction was set empirically using a linking
length equal to 0.2 times the mean interparticle distance, while we are
using 0.168. Differences in the correction, however, are very likely to
be minimal if not negligible for the halo mass range in which we fit
the mass function. Notice also that Crocce et al. (2009) tested this
correction for MICE simulations by randomly removing a fraction of the
dark matter particles as a way of lowering the mass resolution, and
found it to work quite well.
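
As a quick illustration of the size of this correction, the sketch below evaluates the corrected mass assuming the Warren et al. (2006) form $M = M_p N (1 - N^{-0.6})$; the function name is illustrative and masses are expressed in units of the particle mass.

```python
def warren_corrected_mass(n_particles, particle_mass=1.0):
    """FoF halo mass with the discreteness correction N -> N (1 - N**-0.6)."""
    return particle_mass * n_particles * (1.0 - n_particles ** (-0.6))

mass_100 = warren_corrected_mass(100)      # about 6% below the raw mass
mass_10000 = warren_corrected_mass(10000)  # correction below 1%
```

The correction is a few percent for halos of around a hundred particles and becomes negligible for well resolved halos, which is why it matters mostly near the low mass end of the fits.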

We have performed a fit to the mass function data, starting from
different lower mass thresholds for halos. Best fits for the Sheth and
Tormen (ST) functional form are shown as dashed colored lines in
Fig. 14 (top panel), while data are shown as black dots. The fits are
dominated by the lower mass bins, which have smaller errorbars. For
comparison we have added a line showing the mass function with the
commonly used ST fiducial values (p, q) = (0.3, 0.75). To better
appreciate the differences between fits we show, in the bottom panel,
the ratio of the best fit curves to that of the ST fiducial case.

The values of p and q for each fit and their statistical errors are
shown in Table 1. Errors come from jack-knife subsampling and are
computed in the following way. We divide our simulation into 64 compact
regions of equal volume. Then we create a set of J = 64 jack-knife
subsamples of the data by removing each time one of these regions from
the whole sample. For each jack-knife subsample we compute the mass
function and fit its (p, q) parameters. Errors are then obtained as

$$\sigma_\theta^2 = \frac{J - 1}{J} \sum_{i=1}^{J} (\theta_i - \bar{\theta})^2$$ (26)

where $\theta$ is a generic name for any of the parameters in which we
are interested, in this case p and q, and $\bar{\theta}$ is the average
of $\theta$ over the jack-knife subsamples. For the best fit values of
our parameters we take $\bar{\theta}$. When doing the fit, errors in the
mass function are taken to be Poisson, but results do not change
significantly if they are estimated by the jack-knife method as well.
Similar results are obtained if we divide the simulation into 27
jack-knife regions instead of 64. As shown in Table 1, we find that the
jack-knife errors on p and q are smaller than the systematic errors
that we are trying to assess by setting different halo mass thresholds.
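
The jack-knife error described above can be sketched in a few lines. The example assumes the standard delete-one form $\sigma_\theta^2 = \frac{J-1}{J} \sum_i (\theta_i - \bar{\theta})^2$, applied to the parameter values fitted in each subsample; the function name and the three toy values are illustrative.

```python
import numpy as np

def jackknife_error(estimates):
    """Mean and delete-one jack-knife error of a fitted parameter.

    `estimates` holds the parameter fitted in each of the J subsamples
    (each subsample being the full data minus one region).
    """
    theta = np.asarray(estimates, dtype=float)
    n_sub = theta.size
    mean = theta.mean()
    sigma = np.sqrt((n_sub - 1) / n_sub * np.sum((theta - mean) ** 2))
    return mean, sigma

theta_mean, theta_sigma = jackknife_error([1.0, 2.0, 3.0])
```

The factor (J - 1)/J, rather than 1/(J - 1), accounts for the strong overlap between subsamples, which makes the individual estimates scatter much less than independent measurements would.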



log(Mmin)   z     p       q       $\sigma_p$   $\sigma_q$
13.0        0.0   0.334   0.665   0.001        0.003
13.5        0.0   0.309   0.733   0.002        0.004
14.0        0.0   0.275   0.786   0.004        0.006
13.0        0.5   0.347   0.691   0.001        0.003
13.5        0.5   0.312   0.763   0.003        0.004
14.0        0.5   0.280   0.801   0.010        0.011

Table 1. Best fit values of the Sheth and Tormen p and
q parameters to the simulation mass function, and their
jack-knife errors $\sigma_p$ and $\sigma_q$.

4.3 Comparison with the local model

We compare the bias predictions from the mass function fits with the
measured local bias from scatter plots in Fig. 15. Both $b_1$ and $c_2$
are shown as a function of halo mass, and for both redshifts that we
are studying. We find that, generically, the predicted linear bias
$b_1$ falls below the local bias. This happens for all three mass
thresholds we use to fit the mass function.

The best agreement between the linear local bias and the predictions
is found when the mass function is fitted for masses above
$10^{14}\,M_\odot$. The lower the mass threshold used to fit the mass
function, the worse the agreement between measurements and predictions.
For a threshold of $M > 10^{13}\,M_\odot$ predictions are completely
misplaced, for a threshold of $M > 10^{13.5}\,M_\odot$ we have
differences of about 5-10%, while if the threshold is
$M > 10^{14}\,M_\odot$ differences are of a few percent. This few percent
agreement, however, has to be taken with caution because we are using
the high mass halo tail to predict the bias of a halo sample in which
most of the halos did not contribute to the ST fit.

In the same Fig. 15, for comparison with most ST plots in the
literature, we have also shown the predictions for the fiducial ST case
of p = 0.3 and q = 0.75. Its performance is similar to the one with a
threshold of $10^{13.5}\,M_\odot$, i.e., with differences of about 5-10%
with respect to the local bias measurement. If we were to use the
values p = 0.3 and q = 0.707 that also exist in the literature, they
would yield much lower values of $b_1$; thus we confirm the convenience
of using higher values for q, as suggested in Sheth and Tormen (2001).

We show the $c_2$ values from the scatter plot (black dots with
errorbars) in the bottom plot of Fig. 15 against the ST predictions
from the mass function fit (dashed lines). As expected, $c_2$ errors
from the scatter plot are larger than errors in $b_1$, since it is more
difficult to fit the second order of the Taylor expansion than the
first one. For the predictions, statistical errors have been computed
using Jack-knife subsampling (also for $b_1$) but they are not shown
because they are much smaller than the systematics we see by changing
the mass threshold.



Footnote: The only exceptions are halos above $7 \times 10^{\ldots}\,M_\odot$ at z = 0.5,
where predictions seem to be above measurements. This is because at
these masses convergence in the biasing parameters as a function of
$R_s$ (i.e., in Fig. 2) has not been reached for $R_s = 30$ Mpc/h. For
these halos, we have checked that if we use a larger smoothing radius
(i.e., $R_s = 60$ Mpc/h) we recover the general trend where the $b_1$
measured in the scatter plot is above the ones from the mass function.

Figure 14. TOP: Mass function for halos in the MICE simulation at redshifts z = 0 (upper curves) and z = 0.5 (lower curves), compared
with ST best fits starting at log(M) = 13.0, 13.5 and 14.0. BOTTOM: Ratios of the MICE mass function fits and data with respect to the
Sheth and Tormen mass function with p = 0.3 and q = 0.75.

Figure 15. Variation of $b_1$ (top panel) and $c_2$ (bottom panel)
as a function of the halo mass. In black we show the values
measured directly from the $\delta_h$-$\delta_m$ local relation in the
simulation with a smoothing of $R_s = 30$ Mpc/h and compare them with ST
predictions (dashed lines) and the fiducial ST p = 0.3, q = 0.75 case
(dots). As labeled in the figure, each panel has both z = 0 and
z = 0.5 results.



4.4 Comparison with clustering 

So far we have compared the scatter plot bias values both with the
bias from clustering statistics (section 3) and with ST predictions
(this section). These comparisons have allowed us to study the local
bias model. Since the local bias is not a direct observable, we now
proceed to compare directly the bias predictions from the mass function
with the bias from clustering. This comparison for $b_1$ and $c_2$ is
shown in Fig. 16. For reference we have also included the fiducial ST
prediction with p = 0.3 and q = 0.75.

We find that the clustering values of both $b_{hm}$ and $b_{hh}$ are
slightly higher than the ST predictions. Recall that we have
shown that the Poisson shot-noise correction does not work
for $b_{hh}$. The correct shot-noise correction is smaller (see
Eq. 19) and produces values of $b_{hh}$ that are close to $b_{hm}$. Thus
the apparent agreement between $b_{hh}$ and the mass function
predictions for z = 0 is just a fluke, and one should only
compare to $b_{hm}$, which is not affected by shot-noise. The values
of $c_2$ are also affected by the shot-noise correction.

Figure 16. Comparison of the linear and second order bias from
clustering with the ST predictions from the mass function fit.
Errors are from the jackknife method with 64 regions. The smoothing
radius is $R_s = 30$ Mpc/h.
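
The jackknife errors quoted in Fig. 16 can be illustrated with a minimal sketch. It assumes the delete-one variant, in which the statistic is re-evaluated with each of the 64 regions removed in turn. The function names, the simple averaging estimator and the input values are hypothetical and serve only as an example; they are not the measurements of this paper.

```python
import math

def jackknife_error(region_values, estimator):
    """Delete-one jackknife error of `estimator` over a list of regions."""
    n = len(region_values)
    # Re-evaluate the estimator with each region left out in turn.
    leave_one_out = [
        estimator(region_values[:i] + region_values[i + 1:]) for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    # Jackknife variance: (n - 1)/n times the summed squared deviations.
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return math.sqrt(variance)

def mean(values):
    """Plain average, used here as the simplest possible estimator."""
    return sum(values) / len(values)

# Hypothetical bias measurements, one per region (64 regions, as in Fig. 16).
bias_per_region = [1.50 + 0.01 * ((7 * i) % 13 - 6) for i in range(64)]
bias_error = jackknife_error(bias_per_region, mean)
print(f"b = {mean(bias_per_region):.3f} +/- {bias_error:.3f}")
```

For a plain average this reduces to the usual standard error of the mean. For a non-linear statistic, such as a bias fitted to a correlation function, the same function applies with a different `estimator`.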



Similar differences between predictions and measurements
were found by Manera et al. (2010) when studying
the large scale bias from another set of simulations. If we
look only at $b_{hh}$, ST predictions work at the 5-10% level
at z = 0.5. As we will comment in Section 4.5, this could be
enough to calibrate the mass of halos at about the same
percent level. For greater precision more elaborate modeling is
needed.
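
The sensitivity of $b_{hh}$ to the shot-noise correction discussed in this section can be illustrated with a short sketch. It assumes that the correction is written as a constant fraction `alpha` of the Poisson term 1/n (with `alpha = 1` the Poisson case), and that $b_{hh}$ is the square root of the ratio of halo to matter variance. The parameter name `alpha` and all numbers are hypothetical; they are not values from the simulation.

```python
import math

def bias_from_variance(var_halo_raw, var_matter, nbar, alpha):
    """Estimate b_hh from cell variances after subtracting shot noise.

    var_halo_raw : measured variance of the halo fluctuations in cells
    var_matter   : variance of the smoothed matter field
    nbar         : mean number of halos per cell
    alpha        : shot-noise amplitude relative to Poisson (1 = Poisson)
    """
    var_halo = var_halo_raw - alpha / nbar
    if var_halo <= 0.0:
        raise ValueError("shot-noise term exceeds the measured variance")
    return math.sqrt(var_halo / var_matter)

# Illustrative numbers only (hypothetical, not from the simulation).
raw, matter, nbar = 0.060, 0.020, 50.0
b_poisson = bias_from_variance(raw, matter, nbar, alpha=1.0)
b_subpoisson = bias_from_variance(raw, matter, nbar, alpha=0.6)
print(f"Poisson: {b_poisson:.3f}, sub-Poisson: {b_subpoisson:.3f}")
```

Subtracting the full Poisson term removes more variance than a sub-Poisson correction does, so it returns a lower $b_{hh}$ for the same measured variance.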



4.5 Halo mass estimation 

We now explore the potential use of linear bias b measurements
to calibrate the mass threshold of a halo sample.
For a given mass-bias (b-M) relation in the halo model (i.e.,
Eq. 24) we can use the measurements of bias b in the halo
sample to predict the corresponding mass threshold. This is
illustrated in Fig. 17. We have used the clustering biases
measured in the 2-point function (i.e., Fig. 5) and in the variance
(i.e., Fig. 8) at z = 0.5 and used the mass-bias relation from
the ST fit to log M > 14 (which seems to provide the best
fit to data) to calibrate the mass from bias. For the variance
we use $b_{hm}$ (the halo-mass cross variance) rather than $b_{hh}$
to avoid discreteness effects.

[Figure 17 plot, z = 0.5: relative error in the recovered mass versus true mass (log M, with M in $M_\odot/h$), with curves for $b_{hm}$ (R = 30 Mpc/h), the 2-point function (30-80 Mpc/h) and $Q_3$ (24-72 Mpc/h).]

Figure 17. Relative error in the mass recovered using bias measurements
from clustering together with the bias-mass relation in
the peak-background split model obtained from the ST fit to the mass function for
log M > 14. The short-dashed line corresponds to the bias from the
cross-variance $b_{hm}$ in Section 2.5. The continuous line corresponds to
the bias from the 2-point function on large 30-80 Mpc/h scales.
The long-dashed line comes from $Q_3$ in Section 2.3.
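
The inversion of the b-M relation can be sketched as follows. This is only an illustration under stated assumptions: it uses the Sheth-Tormen peak-background split bias at fixed mass rather than the threshold-sample relation of Eq. 24, and `sigma_of_mass` is a hypothetical power law standing in for the true rms fluctuation $\sigma(M)$ of the simulation cosmology. All function names are introduced here for the example.

```python
import math

DELTA_C = 1.686  # spherical-collapse threshold (standard value, assumed here)

def sigma_of_mass(log10_mass):
    """Toy rms fluctuation sigma(M): a hypothetical power law, NOT the MICE one."""
    return 1.3 * 10.0 ** (-0.16 * (log10_mass - 13.0))

def st_linear_bias(log10_mass, p=0.3, q=0.75):
    """Sheth-Tormen peak-background split linear bias at fixed mass."""
    nu = (DELTA_C / sigma_of_mass(log10_mass)) ** 2
    return (1.0 + (q * nu - 1.0) / DELTA_C
            + 2.0 * p / (DELTA_C * (1.0 + (q * nu) ** p)))

def mass_from_bias(b_measured, lo=11.0, hi=16.0, tol=1e-8):
    """Invert b(M) by bisection in log10(M); assumes b rises with M on [lo, hi]."""
    if not st_linear_bias(lo) <= b_measured <= st_linear_bias(hi):
        raise ValueError("bias outside the range covered by the b-M relation")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if st_linear_bias(mid) < b_measured:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: a bias overestimated by 7% (hypothetical) for a true log10(M) = 14.
b_true = st_linear_bias(14.0)
log_m_recovered = mass_from_bias(1.07 * b_true)
relative_mass_error = 10.0 ** (log_m_recovered - 14.0) - 1.0
print(f"recovered log10(M) = {log_m_recovered:.3f}, "
      f"relative mass error = {100.0 * relative_mass_error:.0f}%")
```

Because the bias changes slowly with mass, a given fractional error in b maps into a larger fractional error in M. The size of that amplification depends on the slope of the adopted b-M relation.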

The idea of recovering the mass function from bias and
variance measurements, and subsequently fitting for cosmological
parameters, has been explored by Lima and Hu (2004,
2005, 2007) in the so-called self-calibration method. They
assume a peak-background split prediction to relate the bias to
the mass function (in particular Eq. 24 with the fiducial ST
values), and allow for a scatter relation between the proxy
of the mass (e.g. X-ray temperature) and the true mass. Fig.
17 clearly shows that there is a bias in the recovered mass,
which will propagate into the cosmological fits as a systematic
error. This bias in the recovered mass could in principle
be corrected with the use of mock samples or the non-linear
corrections presented above.

To measure the bias based on the 2-point statistics we
need to know $\sigma_8$. Otherwise we just recover the value of M
in units of $\sigma_8$. This is not the case for $Q_3$, which provides
M independently of $\sigma_8$, but at the expense of a larger
error bar and more systematic effects for small masses. For
large masses there are too few halos to have a reliable
measurement of $Q_3$. Also note that observations are in redshift
space while here we have only shown results in real space.
We expect differences in redshift space and we defer this to
future studies.
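
The different dependence on $\sigma_8$ (the amplitude of matter fluctuations) can be made explicit with the leading-order local bias relations. These are the standard large-scale results, quoted here only as an illustration and assumed to match the conventions used earlier in the paper:

```latex
\xi_h(r) \simeq b_1^2\,\xi_m(r) \propto b_1^2\,\sigma_8^2 ,
\qquad
Q_3^h \simeq \frac{1}{b_1}\left( Q_3 + c_2 \right),
\qquad
c_2 \equiv \frac{b_2}{b_1} .
```

The 2-point amplitude only constrains the product $b_1 \sigma_8$. In contrast, $Q_3$ is a ratio of the 3-point function to products of 2-point functions, so at leading order it does not depend on the amplitude $\sigma_8$ and can be used to separate $b_1$ from $c_2$ through its shape.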



5 CONCLUSIONS 

In this paper we have used a cosmological dark matter
simulation of volume $V = (1536\,{\rm Mpc}/h)^3$ from the MICE
simulation team to study the clustering and bias of halos
above $2 \times 10^{??}\,M_\odot$. We have focused on clustering in
configuration space (as opposed to Fourier space): the 2 and
3-point correlation functions, the variance and skewness, and
the halo-mass cross-correlations. Our main results can be
summarized as follows:

• We have looked at the local deterministic biasing
prescription, which assumes a local non-linear relation $\delta_h = f(\delta)$
between mass fluctuations, $\delta$, and its tracer, $\delta_h$. In
simulations this relation is an approximation with significant
scatter around the mean $f(\delta)$ relation. We have fitted this
scattered relation with a parabola and found the linear bias $b_1$
and the quadratic bias $b_2$ (or equivalently $c_2 = b_2/b_1$) at
different smoothing scales (see Figs 1 & 2). We show that
constant biasing values are reached for smoothing radii larger
than 30-60 Mpc/h. This provides a new interpretation for
the so-called local model: it is local only on average over very
large scales. This has an immediate application for bias
calculations, as one can set $R_s \to \infty$ in practice and neglect
many of the next-to-leading-order terms in a multipoint
expansion.

• We have measured the correlation function of halos,
$\xi_h(r)$, and compared it to the matter correlation function,
finding that the bias is approximately constant at scales
larger than 20 Mpc/h (see Figs 2-3), as predicted by the local
bias model. Given our errors, there is some room for a small
(few percent) scale dependence at scales near the BAO. We
have shown (see Fig. 5) that this bias in the correlation
is very well matched by the local bias prediction from the
scattered $\delta_m$-$\delta_h$ parabola when we use a large smoothing
$R_s$ (where convergence to constant values is reached).

• We have measured the 3-point correlation function of
halos and fitted its shape to obtain $b_1$ and $c_2 = b_2/b_1$. We
have shown (see Fig. 7) that the linear bias obtained from
the 3-point correlation function does not quite match the
bias of the 2-pt correlation function (or the local bias, which
are the same within errors) at our lower mass bins:
$M < 10^{??}\,M_\odot/h \sim 50$ particles. The 3-point predictions for $b_1$
follow well the qualitative behavior of bias as a function of
mass and redshift, but there are some systematic (2-sigma)
deviations at the lower mass end, with good agreement for
$M > 10^{??}\,M_\odot/h$. For $M > 10^{??}\,M_\odot/h \sim 500$ particles, error
bars start becoming too large to draw conclusions.

• We have measured the bias from the halo cross-variance,
$b_{hm} = \langle\delta_h \delta\rangle / \langle\delta^2\rangle$, and found that it differs from the
local bias by about 10%, or even more for the most massive
halos (see Fig. 8). The true local bias can be recovered if we
include a non-linear correction (using the measured $b_2$ of the
local model). This is in contrast to the bias from the 2-pt
correlation function, for which there was no need to include
nonlinear terms.

• We have measured the bias from the halo variance,
$b_{hh} = (\langle\delta_h^2\rangle / \langle\delta^2\rangle)^{1/2}$, and found that it is different from
$b_{hm}$ and from the linear local bias. We have shown that in
order to be able to predict $b_{hh}$ from the local bias we need to
take into account both nonlinear and stochastic effects (see
Fig. 8).

• We have shown that the appropriate discreteness
correction to the variance is sub-Poisson, and found that its
ratio to the Poisson term 1/n is approximately constant in our
range of masses (see Fig. 10). Overcorrecting the variance
with 1/n masks the nonlinear contributions, thus giving an
estimated value of $b_1$ apparently closer to that of the local
bias (especially at z = 0), as we show in Fig. 8.

• We have fitted the mass function of halos with a
Sheth and Tormen functional form and applied the
peak-background split ansatz to predict the bias parameters.
These predictions depend significantly on the mass threshold
used to fit the mass function, and they give systematically
lower linear bias (about 5-10%) than that measured in
clustering or in the local relation (see Fig. 16 and Fig. 15).

• Finally, we have estimated the mass of halos from the
measured bias (Fig. 17), showing that there is a systematic
error when using the common ST peak-background split
prediction. These systematic errors have to be taken into
account when recovering the mass function from clustering of
halos, since they will propagate to the estimator of
cosmological parameters, like the dark energy equation of state.
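
The parabola fit of the local relation mentioned in the first item above can be sketched as follows. The example is illustrative only: it uses synthetic points instead of smoothed simulation cells, assumes the Taylor convention $\delta_h = b_0 + b_1 \delta + (b_2/2) \delta^2$, and the input values of $b_1$ and $b_2$ and the scatter amplitude are hypothetical.

```python
import random

def fit_parabola(x, y):
    """Least-squares fit y = a0 + a1*x + a2*x**2 via the 3x3 normal equations."""
    # Power sums S_k = sum(x^k) and moments T_k = sum(y * x^k).
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented matrix of the normal equations.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

# Synthetic scatter plot: hypothetical b1, b2 and scatter, fixed seed.
rng = random.Random(42)
b1_true, b2_true = 1.5, 0.6
delta_m = [rng.gauss(0.0, 0.2) for _ in range(20000)]
delta_h = [b1_true * d + 0.5 * b2_true * d * d + rng.gauss(0.0, 0.1)
           for d in delta_m]

a0, a1, a2 = fit_parabola(delta_m, delta_h)
b1, b2 = a1, 2.0 * a2  # Taylor coefficients: the quadratic term is b2/2
c2 = b2 / b1
print(f"b1 = {b1:.3f}, b2 = {b2:.3f}, c2 = {c2:.3f}")
```

In the paper the fit is repeated for each smoothing radius, and the values adopted are those reached once $b_1$ and $c_2$ stop changing with $R_s$.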

We can conclude from the above that the different bias
predictions are only accurate to the 5-10% level. In the case of
the 2-point functions (auto and cross-correlations), the
local model seems accurate and we find that the origin of the
discrepancy lies in the peak-background prescription. This
is not so clear for the 3-point function, where probably both
assumptions contribute to the error. For the smoothed
moments, we find that next-to-leading-order and discreteness
corrections (to the local model) are needed at the 10-20%
level. Although this accuracy might still be adequate for
current data, where typical errors are 10-20% (e.g. Norberg et al.
2002, Zehavi et al. 2005, Gaztanaga et al. 2005, Nichol et al.
2006), more work needs to be done to narrow this to the
percent level that will likely be needed in upcoming and future
surveys for precision cosmology and a better understanding of
galaxy evolution.